
Error Tracking

What You Will Learn

  • Why logging exceptions is not the same as tracking errors
  • How to configure the Sentry Python SDK for a FastAPI service
  • How to enrich every error with user context, request metadata, and feature flags
  • How to use breadcrumbs to reconstruct the events leading up to an error
  • How to write before_send hooks to filter and enrich errors programmatically
  • How to group related errors using custom fingerprints
  • How to connect error tracking to your release pipeline for regression detection
  • How to build an error triage workflow that actually resolves bugs

Prerequisites

| Requirement | Details |
| --- | --- |
| Python 3.11+ | Type hints throughout |
| FastAPI | All examples use FastAPI |
| sentry-sdk[fastapi] | pip install "sentry-sdk[fastapi]" |
| Lessons 01–03 complete | Logging context and correlation IDs assumed |

The Incident: Six Weeks, One Error, Zero Resolution

A production Python service has been logging this message for six weeks:

ERROR:app.services.classifier:Something went wrong

Ten thousand times a day. Nobody knows what it is. Nobody has resolved it. Here is why:

  1. The log has no stack trace - the developer wrote logger.error("Something went wrong") without exc_info=True
  2. There is no user context - you cannot tell if it affects one user or all users
  3. There is no grouping - you cannot tell if it is one bug or fifty different bugs producing the same message
  4. There is no alert - it has been happening since before anyone set up alerting
  5. There is no assignment - nobody owns it

Six weeks later, a new developer finds the code:

try:
    result = self.model.classify(text)
except Exception as e:
    logger.error("Something went wrong")  # The bug
    return {"category": "unknown"}

The exception is a KeyError when the model returns an unexpected category. It has been silently corrupting classification results for six weeks. Revenue impact: unmeasured, probably significant.
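A minimal sketch of how the handler above could keep its fallback response without discarding the evidence. The names `classify_with_fallback` and `model` are illustrative assumptions, not part of the original service; the only library behaviour relied on is that `logger.exception` logs at ERROR level with the traceback attached.

```python
import logging

logger = logging.getLogger("app.services.classifier")


def classify_with_fallback(model, text: str) -> dict:
    """Classify text; on failure, log the full traceback and return a fallback."""
    try:
        return model.classify(text)
    except Exception:
        # logger.exception logs at ERROR level and attaches exc_info,
        # so the exception type and stack trace end up in the log record.
        logger.exception("Classification failed; returning fallback category")
        return {"category": "unknown"}
```

With the traceback in the record, the KeyError would have been visible on day one instead of week six. Whether returning a fallback is acceptable at all is a separate product decision.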

Error tracking is not logging. It is a discipline of capturing, grouping, alerting on, and owning every exception your service raises.

1. Sentry Python SDK Setup

Sentry captures exceptions with full context: stack trace, local variables, request data, user information, and environment metadata. It groups similar exceptions together, shows you their frequency and trend, and alerts you when new errors appear.

Installation

# Core SDK + FastAPI integration
pip install "sentry-sdk[fastapi]"

# For SQLAlchemy query capture in error context
pip install "sentry-sdk[sqlalchemy]"

Basic Initialisation

# app/sentry_config.py
import logging

import sentry_sdk
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.integrations.logging import LoggingIntegration
from sentry_sdk.integrations.sqlalchemy import SqlalchemyIntegration
from sentry_sdk.integrations.starlette import StarletteIntegration

# Hooks and sampler live in app/sentry_hooks.py
from app.sentry_hooks import (
    before_send_hook,
    before_send_transaction_hook,
    custom_traces_sampler,
)


def setup_sentry(
    dsn: str,
    environment: str,
    release: str,
    traces_sample_rate: float = 0.1,
    profiles_sample_rate: float = 0.0,
) -> None:
    """
    Initialise Sentry error tracking.

    Args:
        dsn: Sentry project DSN from your Sentry project settings
        environment: "production", "staging", "development"
        release: Version string, typically "{app}@{git_sha}"
        traces_sample_rate: Fraction of transactions to trace (0.0–1.0)
        profiles_sample_rate: Fraction of sampled transactions to profile
    """
    # Capture WARNING-level and above log messages as breadcrumbs
    # Capture ERROR-level and above as Sentry events
    logging_integration = LoggingIntegration(
        level=logging.WARNING,  # Breadcrumb level
        event_level=logging.ERROR,  # Event level (creates a Sentry issue)
    )

    sentry_sdk.init(
        dsn=dsn,
        environment=environment,
        release=release,

        # Integrations: auto-capture errors from these frameworks
        integrations=[
            StarletteIntegration(transaction_style="endpoint"),
            FastApiIntegration(transaction_style="endpoint"),
            SqlalchemyIntegration(),
            logging_integration,
        ],

        # Performance monitoring (APM)
        traces_sample_rate=traces_sample_rate,
        profiles_sample_rate=profiles_sample_rate,

        # Error filtering and enrichment
        before_send=before_send_hook,
        before_send_transaction=before_send_transaction_hook,

        # Attach request data (careful with PII - see before_send)
        send_default_pii=False,  # Don't send cookies, IP, email by default

        # How much of the request body to attach: "never", "small", "medium", "always"
        max_request_body_size="medium",

        # Per-transaction sampling that ignores health checks and metrics noise.
        # When traces_sampler is set it takes precedence over traces_sample_rate.
        traces_sampler=custom_traces_sampler,
    )

Custom Traces Sampler

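# app/sentry_hooks.py
_UNTRACED_PATHS = ("/health", "/metrics", "/liveness", "/readiness")


def custom_traces_sampler(sampling_context: dict) -> float:
    """
    Determine the sample rate for each transaction individually.
    This gives us more control than a flat `traces_sample_rate`.
    """
    transaction_name = sampling_context.get("transaction_context", {}).get("name", "")
    # The ASGI integration is expected to pass the raw scope as "asgi_scope";
    # its path is a fallback when the name is not a URL at sampling time.
    request_path = (sampling_context.get("asgi_scope") or {}).get("path", "")

    # Never trace health checks or metrics endpoints
    if any(path in transaction_name or path in request_path for path in _UNTRACED_PATHS):
        return 0.0

    # Respect the upstream decision: if the calling service sampled this trace,
    # sample it here too so the distributed trace stays complete
    if sampling_context.get("parent_sampled") is True:
        return 1.0

    # Trace 10% of normal requests
    return 0.10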

FastAPI Integration

# app/main.py
import os
from fastapi import FastAPI
from app.sentry_config import setup_sentry

# Use git_sha injected at build time via environment variable
release = f"document-api@{os.environ.get('GIT_SHA', 'unknown')}"

# Sentry must be initialised before the app handles any requests.
# Initialising before the FastAPI app is created is the safest order.
setup_sentry(
    dsn=os.environ["SENTRY_DSN"],
    environment=os.environ.get("ENVIRONMENT", "development"),
    release=release,
    traces_sample_rate=0.1,
)

app = FastAPI()

# The FastApiIntegration automatically captures exceptions from all routes.
# You do not need SentryAsgiMiddleware separately when using the integration.

What a Captured Exception Looks Like in Sentry

When an unhandled exception reaches Sentry, it creates an Issue with:

Title: KeyError: 'business_finance'
Culprit: app.services.classifier in classify

Stack Trace:
  File "app/api/routes/documents.py", line 87, in classify_document
    result = await classifier.classify(text)

  File "app/services/classifier.py", line 43, in classify
    raw_label = self.model.predict(text)["label"]
    return CATEGORY_MAP[raw_label] ← KeyError here

Local Variables at Error Frame:
  text = "Quarterly earnings report for Q4..."
  raw_label = "business_finance" ← Not in CATEGORY_MAP
  CATEGORY_MAP = {"technology": ..., "science": ..., "sports": ...}

Request:
  Method: POST
  URL: /api/documents/classify
  Body: {"text": "Quarterly earnings report..."}

Tags:
  environment: production
  release: document-api@a3b8f2c

Events: 10,247 times in last 7 days
Users affected: 3,891
First seen: 2026-01-19

From this single issue page, you know exactly which exception was raised, on which line of code, for which input, how many users are affected, and when it started. The six-week mystery resolves in 30 seconds.

2. Enriching Error Context

Sentry captures the stack trace automatically. You need to add the application context that makes the error actionable.

Setting User Context

# app/middleware/sentry_context.py
import sentry_sdk
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware


class SentryContextMiddleware(BaseHTTPMiddleware):
    """
    Enriches every Sentry event with user and request metadata.
    Must run after your authentication middleware.
    """

    async def dispatch(self, request: Request, call_next):
        # Set user context - appears in every Sentry event during this request
        user = getattr(request.state, "user", None)
        if user:
            sentry_sdk.set_user({
                "id": user.id,
                "email": user.email,  # Only if not PII-sensitive in your region
                "username": user.username,
                "subscription_tier": user.subscription_tier,
                "organisation_id": user.organisation_id,
            })

        # Set request-level tags (indexed and searchable in Sentry).
        # api.version is low cardinality; request.id is unique per request and
        # is here so you can jump from a log line straight to the event.
        sentry_sdk.set_tag("request.id", request.headers.get("X-Request-ID", "unknown"))
        sentry_sdk.set_tag("api.version", request.headers.get("X-API-Version", "v1"))

        # Set context blocks (rich structured data visible in the error detail)
        sentry_sdk.set_context("request_metadata", {
            "client_ip": request.client.host if request.client else "unknown",
            "user_agent": request.headers.get("user-agent", ""),
            "content_type": request.headers.get("content-type", ""),
            "request_id": request.headers.get("X-Request-ID", ""),
        })

        return await call_next(request)

Programmatic Context in Route Handlers

# app/api/routes/documents.py
import sentry_sdk
from fastapi import APIRouter, Depends, HTTPException, UploadFile

# User, get_current_user, get_feature_flag, classifier and ClassificationError
# are application objects defined elsewhere in the service.

router = APIRouter()


@router.post("/api/documents/classify")
async def classify_document(
    file: UploadFile,
    current_user: User = Depends(get_current_user),
):
    content = await file.read()

    # Add feature flags to Sentry context
    sentry_sdk.set_context("feature_flags", {
        "new_classifier_v2": get_feature_flag("new_classifier_v2", current_user),
        "async_processing": get_feature_flag("async_processing", current_user),
    })

    # Add business-level extra data
    sentry_sdk.set_extra("document_metadata", {
        "filename": file.filename,
        "size_bytes": len(content),
        "content_type": file.content_type,
    })

    try:
        result = await classifier.classify(content)
        return result
    except ClassificationError as exc:
        # Capture with additional context at the point of failure.
        # push_scope is the SDK 1.x API; on SDK 2.x prefer sentry_sdk.new_scope().
        with sentry_sdk.push_scope() as scope:
            scope.set_tag("error.category", "classification")
            scope.set_extra("raw_model_output", exc.raw_output)
            scope.set_extra("model_version", exc.model_version)
            sentry_sdk.capture_exception(exc)
        raise HTTPException(status_code=422, detail="Classification failed") from exc

Sentry Context Types

| Method | Purpose | Visible in Sentry |
| --- | --- | --- |
| sentry_sdk.set_user({"id": "..."}) | Identifies the user | User tab, issue filters |
| sentry_sdk.set_tag("key", "value") | Labels for filtering and search | Tags tab, search bar |
| sentry_sdk.set_context("name", {...}) | Rich structured data blocks | Additional Data tab |
| sentry_sdk.set_extra("key", value) | Arbitrary extra data | Additional Data tab |
| sentry_sdk.capture_message("msg", "error") | Manually capture a message as an event | Creates a Sentry issue |
| sentry_sdk.capture_exception(exc) | Manually capture an exception | Creates a Sentry issue |

3. Breadcrumbs

Breadcrumbs are a trail of events that led up to the error. They answer the question "what was the service doing in the 10 seconds before the exception?"

Sentry automatically collects breadcrumbs from:

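  • Outgoing HTTP requests (via the standard-library HTTP client instrumentation, which also covers requests and urllib3, and via the httpx integration)
  • Database queries (via the SQLAlchemy integration)
  • logging module calls at WARNING and above (the level set in LoggingIntegration earlier; the SDK default is INFO)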

You add custom breadcrumbs for your application logic:

# app/services/document_processor.py
import sentry_sdk


class DocumentProcessor:

    async def process(self, content: bytes, filename: str) -> Document:
        # Add a breadcrumb for each significant step
        sentry_sdk.add_breadcrumb(
            category="document",
            message=f"Processing started: {filename}",
            data={
                "filename": filename,
                "size_bytes": len(content),
            },
            level="info",
        )

        content_type = await self._detect_content_type(content)
        sentry_sdk.add_breadcrumb(
            category="document",
            message="Content type detected",
            data={"content_type": content_type},
            level="info",
        )

        # Before an external API call - if it fails, the breadcrumb shows context
        sentry_sdk.add_breadcrumb(
            category="http",
            message="Calling embedding API",
            data={
                "url": "https://api.openai.com/v1/embeddings",
                "model": "text-embedding-3-small",
                # Rough estimate: about 4 characters per token
                "input_tokens": len(content.decode("utf-8", errors="replace")) // 4,
            },
            level="info",
            type="http",
        )
        embeddings = await self._get_embeddings(content)

        # Before a database write
        sentry_sdk.add_breadcrumb(
            category="db",
            message="Inserting document into database",
            data={"table": "documents", "operation": "INSERT"},
            level="info",
            type="query",
        )
        doc = await self._store(filename, content_type, embeddings)

        sentry_sdk.add_breadcrumb(
            category="document",
            message=f"Processing completed: {doc.id}",
            data={"document_id": doc.id},
            level="info",
        )
        return doc

    async def _run_cache_operation(self, key: str) -> bytes | None:
        """Demonstrate cache miss breadcrumb."""
        cached = await self.cache.get(key)
        # Compare against None so an empty cached value still counts as a hit
        outcome = "hit" if cached is not None else "miss"
        sentry_sdk.add_breadcrumb(
            category="cache",
            message=f"Cache {outcome}",
            data={"key": key, "result": outcome},
            level="debug",
        )
        return cached

What Breadcrumbs Look Like in Sentry

When an error occurs after these breadcrumbs, Sentry shows:

BREADCRUMBS:
─────────────────────────────────────────────────────
09:14:31.100 [info] document Processing started: report.pdf size=204800
09:14:31.234 [info] document Content type detected content_type=application/pdf
09:14:31.240 [debug] cache Cache miss key=cache_key
09:14:31.250 [info] http Calling embedding API model=text-embedding-3-small
09:14:31.891 [error] http POST https://api.openai.com → 429 Rate Limited
09:14:31.891 [warning] document Embedding API rate limited - retrying in 5s
09:14:36.900 [info] http Calling embedding API (retry 1)
09:14:37.441 [info] db Inserting document into database
09:14:37.789 [error] db Query failed: deadlock detected
─────────────────────────────────────────────────────
EXCEPTION: DeadlockError at app/services/document_processor.py:87

Without breadcrumbs, you see "DeadlockError" and have to guess why. With breadcrumbs, you immediately see: the OpenAI rate limit caused a 5-second retry delay, which meant the database transaction held locks for longer than usual, causing the deadlock.
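Conceptually, a breadcrumb trail behaves like a bounded buffer: new entries are appended, and once the limit is reached the oldest entry is evicted. The sketch below illustrates that behaviour with the standard library only. It is not the SDK's implementation, and `BreadcrumbTrail` is a hypothetical name used purely for illustration.

```python
from collections import deque
from datetime import datetime, timezone


class BreadcrumbTrail:
    """Keeps only the most recent `max_breadcrumbs` entries."""

    def __init__(self, max_breadcrumbs: int = 50) -> None:
        # deque(maxlen=N) silently discards the oldest item when full
        self._crumbs: deque = deque(maxlen=max_breadcrumbs)

    def add(self, category: str, message: str, level: str = "info", **data) -> None:
        self._crumbs.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "category": category,
            "message": message,
            "level": level,
            "data": data,
        })

    def snapshot(self) -> list:
        """Return the trail oldest-first, as it would be attached to an event."""
        return list(self._crumbs)
```

The practical consequence is that a chatty loop can push the useful breadcrumbs out of the buffer before the error happens, which is one reason to keep breadcrumbs to significant steps.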

4. Custom Error Grouping

Sentry groups errors by default using the stack trace fingerprint. Sometimes this default grouping is wrong:

  • "Connection pool exhausted" with 50 different stack frames → 50 separate Sentry issues, each with 1 occurrence. Should be 1 issue with 50 occurrences.
  • Timeout errors from different operations → grouped together even though they need different fixes.

Fingerprinting in before_send

def before_send_hook(event: dict, hint: dict) -> dict | None:
    """
    Modify events before they are sent to Sentry.
    Used for: filtering noise, enriching context, custom grouping.
    """
    exc_info = hint.get("exc_info")
    if exc_info is None:
        return event

    exc_type, exc_value, _ = exc_info
    message = str(exc_value).lower()

    # Custom grouping: all connection pool exhaustion errors together
    if "connection pool" in message and "exhausted" in message:
        event["fingerprint"] = ["connection-pool-exhausted"]
        event.setdefault("tags", {})["error.category"] = "infrastructure"
        return event

    # Custom grouping: all validation errors by field name
    if exc_type.__name__ == "ValidationError":
        # Fingerprint entries must be strings
        field_names = sorted(str(f) for f in getattr(exc_value, "fields", []))
        event["fingerprint"] = ["validation-error"] + field_names
        return event

    # Drop noise: client disconnection errors are not bugs
    if exc_type.__name__ in ("ConnectionResetError", "BrokenPipeError"):
        return None  # Drop the event entirely

    # Drop noise: rate limit errors from external APIs
    # (a bare "429" substring can also match IDs or counts; tighten if needed)
    if "rate limit" in message or "429" in message:
        return None

    # Drop noise: invalid user input (these are user errors, not service bugs)
    if exc_type.__name__ == "JSONDecodeError" and "request" in message:
        return None

    return event
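The opposite grouping problem, timeouts from different operations collapsing into one issue, can be handled by extending the default fingerprint rather than replacing it. The following is a minimal sketch, assuming your timeout exceptions carry an `operation` attribute (a hypothetical convention, not something the SDK provides). The special value "{{ default }}" tells Sentry to keep its normal grouping and add your extra component on top.

```python
def split_timeouts_by_operation(event: dict, hint: dict) -> dict:
    """Give timeouts from different operations separate Sentry issues."""
    exc_info = hint.get("exc_info")
    if exc_info is None:
        return event

    exc_type, exc_value, _ = exc_info
    if issubclass(exc_type, TimeoutError):
        # `operation` is a hypothetical attribute set where the error is raised
        operation = getattr(exc_value, "operation", None)
        if operation:
            # Keep Sentry's default grouping and split further by operation
            event["fingerprint"] = ["{{ default }}", str(operation)]
    return event
```

A function like this can be called from inside before_send_hook before the final return, so each grouping rule stays small and testable with hand-built event and hint dicts.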

__sentry_grouping_hash__ on Exception Classes

Alternatively, keep the grouping key on the exception class itself. The SDK is not known to read such an attribute on its own, so treat __sentry_grouping_hash__ as a project convention: your before_send hook must copy its value into event["fingerprint"] for it to take effect.

# app/exceptions.py
class ConnectionPoolExhaustedError(Exception):
    """Raised when the database connection pool is exhausted."""

    @property
    def __sentry_grouping_hash__(self) -> str:
        # All instances of this exception class group together,
        # regardless of which code path triggered them
        return "connection-pool-exhausted"


class DocumentValidationError(Exception):
    """Raised when a document fails validation."""

    def __init__(self, message: str, field: str, value: str):
        super().__init__(message)
        self.field = field
        self.value = value

    @property
    def __sentry_grouping_hash__(self) -> str:
        # Group by field name - different fields are different bugs
        return f"document-validation-error-{self.field}"
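For a class-level grouping key to have any effect, something has to translate it into a fingerprint. A minimal sketch of that glue, assuming the attribute name used above is your own convention and that this function is called from your before_send hook (`apply_class_fingerprint` is a hypothetical helper name):

```python
GROUPING_ATTR = "__sentry_grouping_hash__"  # project convention, not an SDK feature


def apply_class_fingerprint(event: dict, hint: dict) -> dict:
    """Copy an exception's class-level grouping key into the event fingerprint."""
    exc_info = hint.get("exc_info")
    if exc_info is None:
        return event

    _, exc_value, _ = exc_info
    grouping_key = getattr(exc_value, GROUPING_ATTR, None)
    if grouping_key:
        # A single-element fingerprint groups every such event into one issue
        event["fingerprint"] = [str(grouping_key)]
    return event
```

Keeping the key on the class means the grouping decision lives next to the exception definition, while the hook stays a generic few lines that never needs to know individual exception types.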

5. The before_send Hook

The before_send hook is called for every event before it leaves the process. It is your last chance to:

  • Filter out noise (return None to drop the event)
  • Mask PII that should not go to Sentry
  • Enrich the event with additional context

# app/sentry_hooks.py
import re
import sys
from typing import Optional
from urllib.parse import urlsplit

_EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# Deliberately broad: any run of 13-16 digits is masked, including long numeric IDs
_CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")
_JWT_RE = re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\b")

# Exception types that are client errors, not service bugs - drop them
_NOISE_EXCEPTION_NAMES = frozenset({
    "ConnectionResetError",  # Client disconnected
    "BrokenPipeError",  # Client disconnected
    "CancelledError",  # asyncio task cancelled - usually client disconnect
})

# Exception message substrings that indicate noise
_NOISE_PATTERNS = [
    "rate limit",
    "too many requests",
    "connection reset by peer",
    "client disconnected",
]

# Compared case-insensitively against request header names
_SENSITIVE_HEADERS = frozenset({"authorization", "cookie", "x-api-key"})

# Endpoints that should never appear in APM
_NOISY_PATHS = ("/health", "/metrics", "/liveness", "/readiness", "/favicon.ico")


def _mask_string(s: str) -> str:
    """Mask PII patterns in a string."""
    s = _EMAIL_RE.sub("[EMAIL]", s)
    s = _CARD_RE.sub("[CARD]", s)
    s = _JWT_RE.sub("[JWT]", s)
    return s


def _mask_dict(d: dict) -> dict:
    """Recursively mask PII in a dict (for request body, extra data)."""
    result = {}
    for key, value in d.items():
        if isinstance(value, str):
            result[key] = _mask_string(value)
        elif isinstance(value, dict):
            result[key] = _mask_dict(value)
        elif isinstance(value, list):
            result[key] = [
                _mask_dict(i) if isinstance(i, dict)
                else _mask_string(i) if isinstance(i, str)
                else i
                for i in value
            ]
        else:
            result[key] = value
    return result


def before_send_hook(event: dict, hint: dict) -> Optional[dict]:
    """
    Filter and enrich Sentry events before transmission.

    Returns None to drop the event, or the (modified) event to send it.
    """
    exc_info = hint.get("exc_info")

    # Drop noise exceptions
    if exc_info is not None:
        exc_type, exc_value, _ = exc_info
        if exc_type.__name__ in _NOISE_EXCEPTION_NAMES:
            return None
        exc_str = str(exc_value).lower()
        if any(pattern in exc_str for pattern in _NOISE_PATTERNS):
            return None

    # Mask PII in request body
    request = event.get("request", {})
    if "data" in request:
        if isinstance(request["data"], dict):
            request["data"] = _mask_dict(request["data"])
        elif isinstance(request["data"], str):
            request["data"] = _mask_string(request["data"])

    # Mask PII in extra data
    if "extra" in event:
        event["extra"] = _mask_dict(event["extra"])

    # Remove sensitive headers (header names may arrive in any case)
    headers = request.get("headers", {})
    for name in list(headers):
        if name.lower() in _SENSITIVE_HEADERS:
            headers[name] = "[FILTERED]"

    # Add deployment metadata that might not be set in all environments
    event.setdefault("tags", {}).update({
        "python.version": f"{sys.version_info.major}.{sys.version_info.minor}",
    })

    return event


def before_send_transaction_hook(event: dict, hint: dict) -> Optional[dict]:
    """Filter performance transactions (APM spans) before sending."""
    transaction = event.get("transaction") or ""
    # With transaction_style="endpoint" the transaction name is the handler's
    # dotted function path rather than a URL, so check the request URL as well
    url_path = urlsplit(event.get("request", {}).get("url") or "").path

    # Drop health check and metrics transactions from APM
    if transaction.endswith(_NOISY_PATHS) or url_path.endswith(_NOISY_PATHS):
        return None

    return event

6. Release Tracking

Without release tracking, Sentry cannot tell you which version introduced a regression. With it, you can see:

  • "This error was first seen in v2.14.0"
  • "Error rate increased by 300% after deploying v2.15.0"
  • "This issue was marked RESOLVED IN NEXT RELEASE and has reappeared in v2.16.0"

Setting the Release

import os
import subprocess


def get_git_sha() -> str:
    """Get the current git commit SHA, falling back to an env var."""
    try:
        return subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            stderr=subprocess.DEVNULL,
        ).decode().strip()
    except (OSError, subprocess.CalledProcessError):
        # git is not installed (typical in a container image) or this is not a repository
        return os.environ.get("GIT_SHA", "unknown")


# In setup_sentry():
release = f"document-api@{get_git_sha()}"

Sentry CLI: Creating Releases

In your CI/CD pipeline, after deploying a new version, create a release in Sentry:

# Install sentry-cli
pip install sentry-cli
# or: curl -sL https://sentry.io/get-cli/ | sh

# Authenticate
export SENTRY_AUTH_TOKEN=your_token
export SENTRY_ORG=your-org
export SENTRY_PROJECT=document-api

GIT_SHA=$(git rev-parse --short HEAD)
RELEASE="document-api@${GIT_SHA}"

# Create the release
sentry-cli releases new "${RELEASE}"

# Associate commits (shows which commits are in this release)
sentry-cli releases set-commits "${RELEASE}" --auto

# Mark the release as deployed to production
sentry-cli releases deploys "${RELEASE}" new \
  --env production \
  --started "$(date +%s)"

# (For JavaScript/TypeScript frontends: upload source maps here)
# sentry-cli releases files "${RELEASE}" upload-sourcemaps ./dist

Dockerfile Integration

# Dockerfile
FROM python:3.11-slim

# Build arg injected by CI
ARG GIT_SHA=unknown
ENV GIT_SHA=${GIT_SHA}

COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8001"]

# .gitlab-ci.yml
build:
  stage: build
  script:
    - docker build --build-arg GIT_SHA=${CI_COMMIT_SHORT_SHA} -t myapp:${CI_COMMIT_SHORT_SHA} .
    - docker push myapp:${CI_COMMIT_SHORT_SHA}

deploy:
  stage: deploy
  script:
    - kubectl set image deployment/document-api app=myapp:${CI_COMMIT_SHORT_SHA}
    - sentry-cli releases new "document-api@${CI_COMMIT_SHORT_SHA}"
    - sentry-cli releases set-commits "document-api@${CI_COMMIT_SHORT_SHA}" --auto
    - sentry-cli releases deploys "document-api@${CI_COMMIT_SHORT_SHA}" new --env production

7. Performance Monitoring in Sentry

Sentry APM (Application Performance Monitoring) captures transactions (requests) and their child spans. This overlaps with OpenTelemetry tracing from Lesson 03.

When to Use Sentry APM vs OpenTelemetry

| Aspect | Sentry APM | OpenTelemetry + Jaeger |
| --- | --- | --- |
| Setup complexity | Low - already using Sentry SDK | Medium - separate setup |
| Error correlation | Native - traces link to errors automatically | Manual - inject trace ID into logs |
| Cross-service tracing | Yes, if all services use Sentry | Yes, vendor-neutral, any backend |
| Vendor lock-in | Sentry proprietary | None - OTel standard |
| Self-hosted option | GlitchTip (limited APM) | Full OTel stack self-hosted |
| Best for | Single-service or simple architectures | Microservices with diverse tech stacks |

Recommendation: Use OpenTelemetry + Jaeger for distributed tracing (Lesson 03), and use Sentry for error tracking with its built-in performance data as supplementary context. Do not try to replace Jaeger with Sentry APM in a microservices environment.

Sentry Transactions (Quick Setup)

import sentry_sdk


async def nightly_reindex() -> None:
    """Manual transaction for a background job (db and search_index are app objects)."""
    with sentry_sdk.start_transaction(
        name="nightly-reindex",
        op="task",
    ) as transaction:
        transaction.set_tag("job.type", "reindex")

        with sentry_sdk.start_span(op="db.query", description="SELECT documents") as span:
            docs = await db.fetch_all_documents()
            span.set_data("document_count", len(docs))

        with sentry_sdk.start_span(op="index.rebuild", description="Rebuild search index"):
            await search_index.rebuild(docs)

8. Building an Error Workflow

Having Sentry configured is not enough. You need a process that ensures errors are seen, triaged, assigned, and resolved.

Error Workflow: From Alert to Resolution

New Error Detected
        │
        ▼
Sentry creates Issue (groups by fingerprint)
        │
        ▼
Alert fired (Slack / PagerDuty based on rules below)
        │
        ▼
On-call engineer acknowledges in Sentry
        │
        ├── Is this noise? → Configure filter in before_send → IGNORE
        │
        ├── Is this a known issue? → Link to existing ticket → TRACK
        │
        └── Is this a new bug?
                │
                ▼
        Assign to owner (auto-assign by code owners or manually)
                │
                ▼
        Engineer investigates using:
          - Stack trace
          - Breadcrumbs
          - Affected users list
          - Release that introduced it
                │
                ▼
        Fix deployed with release tag
                │
                ▼
        Mark RESOLVED IN NEXT RELEASE
                │
                ▼
        Sentry watches for regression in next release

Sentry Alert Rules

Configure in Sentry UI under Project Settings → Alerts → Issue Alerts:

Alert 1: New Error in Production

Trigger: A new issue is created in environment=production
Action:  Notify #backend-alerts Slack channel
         Create PagerDuty incident (severity: warning)

Alert 2: Error Rate Spike

Trigger: Issue X occurs more than 100 times in 1 hour
Action:  Notify #incidents Slack channel
         Create PagerDuty incident (severity: critical)

Alert 3: Regression (resolved issue reappears)

Trigger: A resolved issue is seen again in the current release
Action:  Notify the issue assignee and #backend-alerts

Alert 4: New users affected

Trigger: Issue affects more than 50 unique users
Action:  Notify product manager + #incidents

Ownership Rules

Configure in Sentry Project Settings → Code Owners or Ownership Rules:

# Sentry Ownership Rules (the syntax differs from a GitHub CODEOWNERS file)
# Format: matcher:pattern owner [owner ...]
# Owners are teams (#team-slug) or member email addresses

path:app/api/routes/documents.py #backend-team
path:app/services/classifier.py #ml-team
path:app/services/payments.py #payments-team
path:app/db/* #dba-team

# Tag-based ownership
tags.component:payments #payments-team
tags.component:classifier #ml-team

Error Budget Tracking

Connect error tracking to your SLO (covered in Lesson 05). The error budget is the share of failures the SLO permits: with a 99.9% availability SLO, up to 0.1% of requests or sessions may fail. Sentry can track the percentage of sessions with errors:

SLO: < 0.1% of sessions encounter an error
Error budget per month: 43.8 minutes of downtime or 0.1% of requests

Current status (from Sentry dashboards):
Error rate this month: 0.047%
Error budget remaining: 53%
Burn rate: 0.94x (healthy - burning at less than 1x)
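The arithmetic behind those figures can be sketched in a few lines. The function names are illustrative, the month length is taken as an average of 30.4375 days, and the burn-rate reading (budget consumed divided by the fraction of the period elapsed) is one common definition. Reproducing the 0.94x figure above requires assuming that half the month has elapsed, which the status block does not state.

```python
def error_budget_minutes(slo: float, days: float = 30.4375) -> float:
    """Minutes of full downtime permitted per period by an availability SLO."""
    return (1.0 - slo) * days * 24 * 60


def budget_remaining(observed_error_rate: float, slo: float) -> float:
    """Fraction of the error budget still unspent (1.0 = untouched)."""
    allowed_error_rate = 1.0 - slo
    return 1.0 - observed_error_rate / allowed_error_rate


def burn_rate(budget_consumed: float, period_elapsed: float) -> float:
    """Values above 1.0 mean the budget runs out before the period ends."""
    return budget_consumed / period_elapsed


minutes = error_budget_minutes(0.999)          # about 43.8 minutes per month
remaining = budget_remaining(0.00047, 0.999)   # about 0.53, i.e. 53% left
rate = burn_rate(1.0 - remaining, 0.5)         # assumes half the month has elapsed
```

Alerting on burn rate rather than on the raw error rate is what lets a slow, steady leak trigger a ticket while a short spike triggers a page.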

9. Self-Hosted Alternative: GlitchTip

GlitchTip is an open-source, Sentry-compatible error tracking server. It uses the same Sentry SDK - you only change the DSN to point at your own server.

docker-compose Setup

# docker-compose.yml (GlitchTip)
services:
  glitchtip-db:
    image: postgres:16
    environment:
      POSTGRES_DB: glitchtip
      POSTGRES_USER: glitchtip
      POSTGRES_PASSWORD: glitchtip_password

  glitchtip-redis:
    image: redis:7

  glitchtip:
    image: glitchtip/glitchtip:v4
    ports:
      - "9000:8000"
    environment:
      DATABASE_URL: postgresql://glitchtip:glitchtip_password@glitchtip-db:5432/glitchtip
      REDIS_URL: redis://glitchtip-redis:6379
      SECRET_KEY: your-secret-key-here
      EMAIL_URL: smtp://USER:PASSWORD@SMTP_HOST:587  # placeholders: your SMTP credentials and host
      GLITCHTIP_DOMAIN: http://localhost:9000
      DEFAULT_FROM_EMAIL: SENDER_ADDRESS  # placeholder: the address alerts are sent from
    depends_on:
      - glitchtip-db
      - glitchtip-redis

  glitchtip-worker:
    image: glitchtip/glitchtip:v4
    command: ./bin/run-celery-with-beat.sh
    environment:
      DATABASE_URL: postgresql://glitchtip:glitchtip_password@glitchtip-db:5432/glitchtip
      REDIS_URL: redis://glitchtip-redis:6379
      SECRET_KEY: your-secret-key-here
    depends_on:
      - glitchtip-db
      - glitchtip-redis

Pointing Your Python App at GlitchTip

sentry_sdk.init(
    dsn="http://your_public_key@localhost:9000/1",  # GlitchTip DSN
    environment="production",
    release=f"document-api@{get_git_sha()}",
    integrations=[FastApiIntegration(), SqlalchemyIntegration()],
    before_send=before_send_hook,
)

Everything else is identical - the SDK does not know it is talking to GlitchTip instead of Sentry.

GlitchTip vs Sentry Comparison

| Feature | GlitchTip (self-hosted) | Sentry (cloud) |
| --- | --- | --- |
| Cost | Free (hosting costs only) | From $26/mo |
| Error grouping | Yes | Yes (more advanced ML-based) |
| Performance APM | Limited | Full |
| Source maps | Yes | Yes |
| Release tracking | Yes | Yes |
| Data sovereignty | Full control | Data in Sentry's cloud |
| Maintenance burden | You manage upgrades | None |
| Best for | Teams with data sovereignty requirements | Teams wanting zero maintenance |

Complete Error Tracking Setup Checklist

# app/sentry_config.py - complete production setup

import os
import sys
import logging
import sentry_sdk
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.integrations.starlette import StarletteIntegration
from sentry_sdk.integrations.sqlalchemy import SqlalchemyIntegration
from sentry_sdk.integrations.logging import LoggingIntegration
from app.sentry_hooks import before_send_hook, before_send_transaction_hook, custom_traces_sampler


def setup_sentry() -> None:
    dsn = os.environ.get("SENTRY_DSN")
    if not dsn:
        logging.getLogger(__name__).warning(
            "SENTRY_DSN not set - error tracking disabled"
        )
        return

    environment = os.environ.get("ENVIRONMENT", "development")
    git_sha = os.environ.get("GIT_SHA", "unknown")
    service_name = os.environ.get("SERVICE_NAME", "unknown-service")
    release = f"{service_name}@{git_sha}"

    sentry_sdk.init(
        dsn=dsn,
        environment=environment,
        release=release,
        integrations=[
            StarletteIntegration(transaction_style="endpoint"),
            FastApiIntegration(transaction_style="endpoint"),
            SqlalchemyIntegration(),
            LoggingIntegration(
                level=logging.WARNING,
                event_level=logging.ERROR,
            ),
        ],
        traces_sampler=custom_traces_sampler,
        before_send=before_send_hook,
        before_send_transaction=before_send_transaction_hook,
        send_default_pii=False,
        max_breadcrumbs=50,
        attach_stacktrace=True,  # Attach stack trace to all events, not just exceptions
        in_app_include=["app"],  # Mark frames from the 'app' package as in-app; other frames are de-emphasised
        max_request_body_size="medium",
    )

    # Tag every event with Python version and runtime info.
    # configure_scope is the SDK 1.x API and is deprecated in SDK 2.x.
    with sentry_sdk.configure_scope() as scope:
        scope.set_tag("python.version", f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}")
        scope.set_tag("runtime", "cpython")

Interview Questions and Answers

Q1: Sentry is grouping two completely different bugs - a TimeoutError in the payment service and a TimeoutError in the document OCR service - into the same issue because they have the same exception class. How do you fix this without changing the code that raises the exceptions?

Configure a custom fingerprint in the before_send hook. Inspect the stack trace in the event to determine which module raised the exception, and use that as part of the fingerprint:

def before_send_hook(event, hint):
    if exc_info := hint.get("exc_info"):
        exc_type, _, _ = exc_info
        if exc_type.__name__ == "TimeoutError":
            # Use the module of the innermost in-app frame as part of the fingerprint.
            # The very last frame is often inside a library (asyncio, an HTTP client).
            values = event.get("exception", {}).get("values") or [{}]
            frames = (values[-1].get("stacktrace") or {}).get("frames") or []
            candidates = [f for f in frames if f.get("in_app")] or frames
            if candidates:
                module = candidates[-1].get("module", "unknown")
                event["fingerprint"] = ["TimeoutError", module]
    return event

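Alternatively, if changing the raising code becomes an option, define separate exception subclasses (PaymentTimeoutError, OCRTimeoutError) and give each its own grouping key, for example through the __sentry_grouping_hash__ convention from section 4 with a before_send hook that copies it into the fingerprint. This is often the cleanest solution because it also improves code clarity.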

Q2: You have a multi-tenant SaaS with 10,000 organisations. One organisation is generating 95% of your Sentry errors because their data is triggering an edge case. How do you see this in Sentry, and how do you handle the alerting noise while the bug is being fixed?

In Sentry, open the affected issue and click "Users" tab - it shows a breakdown of affected users and their organisations. You can filter by tag (if you set organisation_id as a tag). To reduce noise while fixing: (1) In the before_send hook, check if organisation_id matches the problematic organisation and add a tag noise=true, then configure a Sentry alert rule to exclude noise=true events. (2) Or use Sentry's "Mute until" feature to suppress alerts on this specific issue for 24 hours while the fix is deployed. (3) The cleanest engineering solution: add a before_send filter that samples this error type at 1% while the organisation tag matches the problematic org, keeping Sentry useful while reducing volume by 99%.
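Option (3) can be sketched as a small before_send filter. This assumes organisation_id has been set as a tag and that tags arrive in the event as a dict; `NOISY_ORG_ID`, `KEEP_FRACTION`, and the function name are hypothetical. The random source is injectable so the filter can be tested deterministically.

```python
import random

NOISY_ORG_ID = "org_1234"   # hypothetical: the organisation triggering the edge case
KEEP_FRACTION = 0.01        # let roughly 1% of its events through


def sample_noisy_org(event: dict, hint: dict, rng=random.random):
    """Drop most events from one noisy organisation, keep everything else."""
    org_id = event.get("tags", {}).get("organisation_id")
    if org_id == NOISY_ORG_ID and rng() >= KEEP_FRACTION:
        return None  # dropped: Sentry never sees this event
    return event
```

Because the kept events are a sample, event counts for that organisation in Sentry are roughly a hundredth of the real volume while the filter is in place. Remove it once the fix ships so regression detection sees the true rate again.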

Q3: What is the difference between sentry_sdk.capture_exception(), sentry_sdk.capture_message(), and letting Sentry capture an exception automatically from an unhandled exception?

sentry_sdk.capture_exception(exc) manually sends an exception to Sentry from any point in your code - typically inside a try/except block where you handle the exception locally but still want to track it. The exception is marked as "handled." Unhandled exceptions captured automatically (when they bubble up to the FastAPI integration) are marked as "unhandled" - Sentry gives these higher severity by default. sentry_sdk.capture_message("text", level="error") sends a message without an exception - useful for non-exception alerts like "payment webhook signature validation failed" where you catch the case but want visibility. In general: let unhandled exceptions be captured automatically, use capture_exception for handled errors that you still want to track, and use capture_message sparingly for important non-exception events.

Q4: Your before_send hook is dropping about 40% of all Sentry events as noise. How do you verify that you are not accidentally dropping real bugs?

Several approaches: (1) Log every dropped event with logger.debug("sentry.event.dropped", exc_type=..., reason=...) - this creates a record of what is being filtered without sending it to Sentry. (2) Add a Prometheus counter sentry_events_dropped_total with a reason label - you can alert if the drop rate for a specific reason suddenly changes. (3) Periodically review the before_send code in code review, treating each return None as a policy decision that must be justified with a comment. (4) Run in "shadow mode" for a week in staging: modify before_send to send events to a separate Sentry project instead of dropping them, so you can compare what you're missing. (5) Implement a sampling-based drop: instead of dropping 100% of a noisy error type, drop 99% and let 1% through - this gives you visibility into whether the error type is changing.
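Approaches (1) and (2) amount to making every drop observable. A minimal sketch using only the standard library, where a `Counter` stands in for the Prometheus counter and the names `dropped_events`, `record_drop`, and `filter_with_accounting` are illustrative assumptions:

```python
import logging
from collections import Counter

logger = logging.getLogger("sentry.filter")

# Stand-in for a metrics counter labelled by drop reason
dropped_events: Counter = Counter()

NOISE_EXCEPTION_NAMES = frozenset({"ConnectionResetError", "BrokenPipeError"})


def record_drop(reason: str, exc_type_name: str) -> None:
    """Count the drop and leave a debug-level trace of what was filtered."""
    dropped_events[reason] += 1
    logger.debug("sentry.event.dropped reason=%s exc_type=%s", reason, exc_type_name)


def filter_with_accounting(event: dict, hint: dict):
    """A before_send-style filter where every `return None` is recorded."""
    exc_info = hint.get("exc_info")
    if exc_info is not None:
        exc_type, exc_value, _ = exc_info
        if exc_type.__name__ in NOISE_EXCEPTION_NAMES:
            record_drop("client_disconnect", exc_type.__name__)
            return None
        if "rate limit" in str(exc_value).lower():
            record_drop("rate_limit", exc_type.__name__)
            return None
    return event
```

With per-reason counts exported, a sudden jump in one reason becomes an alertable signal that a filter has started swallowing something new.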

Q5: A teammate argues that Sentry and structured logging (Lesson 01) are redundant - "if we have Sentry, why log the error at all?" How do you respond?

They solve different problems and complement each other. Sentry captures exceptions with full context - stack traces, breadcrumbs, user info, local variables - and groups, deduplicates, and alerts on them. It is optimised for the "what broke and why?" question. Structured logging captures a timestamped stream of events for every request, successful or not. It answers "what did the service do at 09:14:32?" - which includes the successful operations that provide context for understanding why the error occurred. Additionally: (1) Logs capture non-exception events (successful requests, cache hits, business events) that Sentry does not. (2) Logs are searchable by arbitrary fields in Loki/Kibana - Sentry search is limited to its own data model. (3) Logs correlate with metrics and traces via request_id and trace_id. (4) Logs are your audit trail; Sentry is your error inbox. You need both.

© 2026 EngineersOfAI. All rights reserved.